Build a Model with Google AutoML

Submission is Complete

Criteria | Meets Specification

Is the submission complete?

All questions in the AutoML Modeling Report have been completed. Screenshots of all 4 confusion matrices are included.

Clean/Balanced Data

Criteria | Meets Specification

Does the student understand the idea of splitting data into training and testing sets?

The student correctly reports the number of images used for training and testing.

Does the student understand what the confusion matrix shows?

The student should explain the values observed in each of the four cells of the confusion matrix. The student correctly identifies the true positive rate for the “pneumonia” class and the false positive rate for the “normal” class.
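
For reference, the sketch below walks through the four cells with made-up counts and shows how the true positive rate for the "pneumonia" class and the false positive rate for the "normal" class fall out of them. It is a minimal illustration, not actual AutoML output.

```python
# Hypothetical counts for a binary confusion matrix (positive class = "pneumonia").
tp = 95   # pneumonia images correctly predicted as pneumonia
fn = 5    # pneumonia images incorrectly predicted as normal
fp = 8    # normal images incorrectly predicted as pneumonia
tn = 92   # normal images correctly predicted as normal

# True positive rate for the "pneumonia" class (its recall/sensitivity).
tpr = tp / (tp + fn)   # 0.95

# False positive rate: fraction of "normal" images misclassified as "pneumonia".
fpr = fp / (fp + tn)   # 0.08

print(f"TPR (pneumonia): {tpr:.2f}   FPR (normal): {fpr:.2f}")
```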

Does the student understand the meaning of precision and recall?

The student correctly explains the meaning of precision and recall, and they report the precision and recall they observed.
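
Precision and recall come from the same four cells; a minimal sketch using the same hypothetical counts:

```python
tp, fp, fn = 95, 8, 5  # hypothetical counts; positive class = "pneumonia"

precision = tp / (tp + fp)  # of the images predicted "pneumonia", the fraction that truly are
recall = tp / (tp + fn)     # of the true "pneumonia" images, the fraction the model caught

print(f"precision={precision:.2f}  recall={recall:.2f}")
```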

Does the student understand how the score threshold changes precision and recall?

The student correctly explains the effect of increasing the score threshold on precision and recall, and describes why.
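
One way to see why: sweep a score threshold over a small set of hypothetical labels and confidence scores (all values below are made up). Raising the threshold makes the model label fewer images as "pneumonia", so precision tends to rise while recall falls.

```python
import numpy as np

# Hypothetical ground-truth labels (1 = pneumonia) and model confidence scores.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
scores = np.array([0.95, 0.85, 0.70, 0.55, 0.60, 0.40, 0.30, 0.20, 0.45, 0.52])

for threshold in (0.3, 0.5, 0.7):
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn)
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```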

Clean/Unbalanced Data

Criteria | Meets Specification

Did the student correctly configure the unbalanced dataset?

The student should have used 400 images (100 from the normal class and 300 from the pneumonia class) and correctly described how those images are distributed between training and testing.
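
As a sanity check on the expected counts, the arithmetic below assumes AutoML Vision's default 80% train / 10% validation / 10% test split; actual counts in a submission may differ if a custom split was used.

```python
# Expected per-class counts under an assumed 80/10/10 train/validation/test split.
counts = {"normal": 100, "pneumonia": 300}

for label, n in counts.items():
    train, val, test = int(n * 0.8), int(n * 0.1), int(n * 0.1)
    print(f"{label}: {train} train, {val} validation, {test} test")
```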

Does the student understand how unbalanced classes impacted the confusion matrix?

The student describes how the confusion matrix changed relative to the clean/balanced model, and explains what potentially caused these results.

Does the student understand how precision and recall changed?

The student reports the precision and recall they observed.

Does the student understand the impact of unbalanced data overall?

The student should note how unbalanced data impacted the model based on what they observed.

Dirty/Balanced Data

Criteria | Meets Specification

Does the student understand how dirty data impacted the confusion matrix?

The student describes how the confusion matrix changed, and explains what potentially caused these results.

Does the student understand how precision and recall changed? Does the student understand how the different datasets impacted precision and recall?

The student describes how precision and recall changed in this model, and evaluates which binary classification model produced the highest precision and recall.

Does the student understand the impact of dirty data?

The student provides a summary of what they observed and an appropriate interpretation of the impact of dirty data.

Three-Class Model

Criteria | Meets Specification

Does the student understand the 3-class confusion matrix? Does the student understand how the model might be improved?

The student provides and correctly interprets the 3-class confusion matrix. The student provides an idea for how to improve the model.

Does the student understand how 3-class precision and recall are calculated?

The student reports their precision and recall, and correctly reports how 3-class precision and recall are calculated.
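
A minimal sketch of one common convention: compute precision and recall per class from the 3x3 confusion matrix and macro-average them. The class names and counts below are placeholders, and the averaging convention is an assumption for illustration rather than a statement of exactly how AutoML aggregates its reported numbers.

```python
import numpy as np

# Hypothetical 3x3 confusion matrix: rows = true class, columns = predicted class.
labels = ["class_a", "class_b", "class_c"]
cm = np.array([
    [90,  5,  5],
    [ 4, 80, 16],
    [ 6, 20, 74],
])

per_class_precision = np.diag(cm) / cm.sum(axis=0)  # correct / all predicted as that class
per_class_recall    = np.diag(cm) / cm.sum(axis=1)  # correct / all true members of that class

for label, p, r in zip(labels, per_class_precision, per_class_recall):
    print(f"{label}: precision={p:.2f}  recall={r:.2f}")

# Macro averages: the unweighted mean over classes.
print(f"macro precision={per_class_precision.mean():.2f}  macro recall={per_class_recall.mean():.2f}")
```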

Does the student understand how to calculate F1 score?

The student correctly calculates the model’s F1 score.
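
For reference, F1 is the harmonic mean of precision and recall; the values below are placeholders to substitute with the precision and recall the model reports.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Placeholder values; substitute the precision and recall your 3-class model reports.
print(round(f1_score(0.91, 0.88), 3))  # ~0.895
```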

Tips to make your project stand out:

  • Train subsequent versions of the 3-class model to find how much data is needed to achieve > 90% precision and recall for all classes.
  • Find another image classification dataset on Kaggle and train models with various data splits.
  • Integrate with the AutoML API, run inference on 1000 images with at least one of the models, and produce your own confusion matrix (a rough sketch follows this list).
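
A rough sketch of the API tip, assuming the google-cloud-automl Python client (v2+), scikit-learn, and evaluation images stored in one folder per true label. The project ID, model ID, paths, and label names are placeholders to replace with your own.

```python
# Rough sketch: run inference against a deployed AutoML Vision model and build a
# confusion matrix from the results. Project ID, model ID, folder layout, and
# label names are placeholders; adjust them to your own setup.
from pathlib import Path

from google.cloud import automl
from sklearn.metrics import confusion_matrix

project_id = "your-gcp-project"      # placeholder
model_id = "ICN0000000000000000000"  # placeholder
image_root = Path("eval_images")     # expects eval_images/<true_label>/*.jpeg
labels = ["normal", "pneumonia"]

prediction_client = automl.PredictionServiceClient()
model_full_id = automl.AutoMlClient.model_path(project_id, "us-central1", model_id)

y_true, y_pred = [], []
for label in labels:
    for image_path in (image_root / label).glob("*.jpeg"):
        payload = automl.ExamplePayload(
            image=automl.Image(image_bytes=image_path.read_bytes())
        )
        # score_threshold "0.0" asks for a score on every label so we can take the top one.
        response = prediction_client.predict(
            request=automl.PredictRequest(
                name=model_full_id,
                payload=payload,
                params={"score_threshold": "0.0"},
            )
        )
        top = max(response.payload, key=lambda r: r.classification.score)
        y_true.append(label)
        y_pred.append(top.display_name)

# Rows = true labels, columns = predicted labels, in the order given by `labels`.
print(confusion_matrix(y_true, y_pred, labels=labels))
```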